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The Overlap operator fulfills the Ginsparg-Wilson relation exactly and therefore represents an op- 
timal discretization of the QCD Dirac operator with respect to chiral symmetry. When computing 
propagators or in HMC simulations, where one has to invert the overlap operator using some iter- 
ative solver, one has to approxomate the action of the sign function of the (symmetrized) Wilson 
fermion matrix Q on a vector b in each iteration. This is usually done iteratively using a 'primary' 
Lanczos iteration. In this process, it is very important to have good stopping criteria which allow 
to reliably assess the quality of the approximation to the action of the sign function computed so 
far In this work we show how to cheaply recover a secondary Lanczos process, starting at an ar- 
bitrary Lanczos vector of the primary process and how to use this secondary process to efficiently 
obtain computable error estimates and error bounds for the Lanczos approximations to sign(2)^, 
where the sign function is approximated by the Zolotarev rational approximation. 



XXIX International Symposium on Lattice Field Theory 
July 10-16, 2011 

Squaw Valley, Lake Tahoe, California 



* Speaker. 

^This work was supported by Deutsche Forschungsgemeinschaft through the Collaborative Research Centre SFB- 
TR 55 "Hadron Physics from Lattice QCD" 



© Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike Licence. 



http://pos.sissa.it/ 



Error Bounds for the Sign Function 



Andreas Frommer 



1. Introduction 

Overlap fermions as a lattice formulation of QCD respecting chiral symmetry have been pro- 
posed in ^ and been investigated since by many authors. The overlap operator still represents 
the discrete Dirac operator which most neatly deals with chiral symmetry, fulfilling the Ginsparg- 
Wilson relation on the lattice exactly. If describes the hopping part of the standard Wilson 
fermion matrix and its critical hopping parameter, the overlap operator is given as 



Do = I + p 75sign(e) with Q = 75 (/ - 




Herein, p is a mass parameter which is close to 1. 

A direct computation of sign(2) is not feasible, since Q is large and sparse, whereas sign(2) 
would be full. Therefore, numerical algorithms which invert systems with the matrix Dq have 
to follow an inner-outer paradigm: One performs an outer Krylov subspace method where each 
iteration requires the computation of a matrix- vector product involving sign (2). Each such product 
is computed through another, inner iteration using matrix-vector multiplications with Q. In this 
context, it is very important to be able to assess the accuracy of the computed approximation to 
sign(2)ft from the inner method, since one can steer the outer method so as to require less and less 
accurate computations of sign(2)Z?, resulting in substantial savings in computational work, see ||l|]. 

In this work we precisely consider the task of obtaining reliable error estimates and bounds 
when computing approximations for sign(2)Zj. Most preferably, we would like to have a precise 
upper bound, so that a stopping criterion based on that upper bound will guarantee that the exact 
error is below this bound. Actually, we will consider the case where the sign function sign(f) is 
approximated by a rational function g{t), the Zolotarev approximation. This approach has estab- 
Ushed itself as the method of choice, since the multishift eg method allows for an efficient update 
of the iterates, involving only short recurrencies and thus few memory [^. 

Usually, one fixes the rational Zolotarev approximation g{Q)b such that the error w.r.t. the 
sign function is less than ei on the spectrum of Q. An error bound £2 for the approximation of 
g{Q)b then results in an overall error bound £1 + £2 w.r.t. sign(2)Z7. 



2. Lanczos process and Lanczos approximations 

Assuming that vi € C" is normalized to ||vi II2 = 1, the Lanczos process computes orthonormal 
vectors vi , V2, . . . such that vi , . . . , form an orthonormal basis of the nested sequence of Krylov 



subspaces K,n{Q,v\), m= 1,2, It is given here as Algorithm gj_ 



The Lanczos process can be summarized via the Lanczos relation 

AV„, = V„,_|_ 1 r„,^ 1 „, = r„j + /3m • ej^ v,„+ 1 , (2.1) 
where V„, = [vi | . . . |v„,] € C"^"' is the matrix containing the Lanczos vectors, = (0, . . . ,0, 1)* € 
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Algorithm 2.1: Lanczos process with maztrix A and starting vector vi 


choose vi such that vi =1 


letj8o:=0, vq := 


for 7 = 1 , . . . , m do 








aj = v*wj 




Wj = wj - ajvj 




Pj=\\Wj\\2 




if pj = then stop 




Vj+l = 


end 



C" and 



' m+ 1 .m 



Pi Ol2 



G . 



[)(m+l)xm 



■•• '•• P,n-l 
Pm— 1 

pm 

with T,n a (real) symmetric tridiagonal matrix. 

Let ^(f) = Y!i=\ be the Zolotarev approximation to f^^/^. We get the m-th Lanczos approx- 
imation to g{Q^){Qb), which in turn approximates sign{Q)b, by running a multishift eg method, 

based on the Lanczos process, for the p systems {Q^ — Gil)xi = Qb. This is summarized as Al- 

r — I 9 (i) 

gorithm 2.2, where A = Q ,c = Qb. Herein, the factors p,„ are the scaling factors between the 

Lanczos vector and the residuals, see p|]: 



Qb-(^4l 



PmVm+i, andp,i/ 



i-m\rl!^\\2. 



(2.2) 



For the error of the m-th approximation x,„ we obtain 

P P P 



E - Oiiy'iQb) -x,„ = £ (OiiQ^ - Oil) 



(=1 



i=\ 



so we can express \\e„ 



as 



m+ 1 1 



igm(2^)vm+i, where 



(2.3) 



The elegant theory of moments and quadrature developed in ^] allows to bound this quantity, 
and more generally quantities of the form v*h{A)v, from below and from above by performing some 
steps of the Lanczos process for with starting vector v,„+i. The precise results is as follows: 



Theorem 1. Let 71 denote the tridiagonal matrix in the Lanczos relation (2.1) arising after k steps 
of the Lanczos process with starting vector v, \\v\\ = 1. Assume that h'.M.^M. is at least 2k + 2 
times continuously differentiable on an open set containing [a.,b\, where spec(A) C [a,^?]. 
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Algorithm 2.2: Multishift eg 



setx_i =0,pi'' = ||c||2, = 1, VI = (l/||c||)c 
for y = 0, 1, . . . do 

compute OCy+i, /3y+i, v^_|_2 using the Lanczos process for A for j = 1 do 
if ;>Otlien 



r(') 



a,-o, I Pj \ 1 



end 



end 

^ _yp JO. 



end 



(i) Approximating v*h{A)v with the Gauss quadrature rule using k nodes tj G {a,b) gives 

v*/j(A)v = e\h{T^)ei +R^[h], where = fk, 
with the error Rf[h] given as 



{2k)l 



f 

J a 



;=1 



dy{t), a<t,<b . 



(2.4) 



(ii) Approximating v*h{A)v with the Gauss-Radau quadrature rule using k—\ nodes tj G {a,b) 
with one additional node fixed at a gives 

v*h{A)v = e\h{T^^)e,+R^^[h]. 

Here, the tridiagonal matrix T^^ differs fi-om in that its {k, k) entry is replaced by 
Uk =a + 5k-i, where 5k-\ is the last entry of the vector 5 with {f^-i —al)5 = ^^_-^ek-i. The 
error R^^[h] is given as 



k-i 
.7=1 



d'Y{t), a<t,<b . 



(2.5) 



( Hi) Approximating v*h{A)v with the Gauss-Lobatto quadrature rule using k — 2 nodes tj G {a,b) 
and two additional nodes, one fixed at a and one fixed at b, gives 



v*h{A)v = e\h{T^^)ei+Rf^[h] 
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Here, the tridiagonal matrix T^^ differs from in its last column and row. With 5 and jl the 



solutions of the system {fk-\ —al)5 = ek-\, {fk-\ 
of the linear system 



bl)}JL = ek-\ and ak,[i^_i the solution 



1 


-4" 








a 


1 


-l^k 








b 



the tridiagonal matrix T^^ is obtained from by replacing ak by otyt and ^u-i by Pk-i- The 
error Rf^ [h] is given as 



Rf-ih] 



{2k-2)\ 



■a){t-b) 



k-2 



dy{t), a<t, <b . 



(2.6) 



W apply Theorem [l| to the rational functions h = g^ representing the error in (2.3) Inspecting 
the terms R^[h], R^^[h] and Rf-[h] and noticing that < (> 0) for ? € [0,oo) if I is odd 

(even), we get the following corollary. 



Corollary 1. In the case h{t) = gm{t)'^ with g^from (23), the estimates e\h{T^)e\and e\h{T^^)e\ 
from Theorem ^(i), (Hi) represent lower bounds, the estimate e\h{T^^)ei from (ii) represents an 
upper bound for the (square of the) error \\xm — 



3. Lanczos restart recovery 

To avoid ambiguities, let us call primary Lanczos process the one of the multishift eg method, 
i.e. the Lanczos process through which we obtain the approximations a,,,. The straightforward 
way to obtain the error estimates from Theorem [l] would be to perform k steps of a new, restarted 
Lanczos process which takes the current Lanczos vector v^+i of the primary process as its starting 
vector. This results in the restarted Lanczos relation 

AVl = Vl^rTl+i,u, (3.1) 

and we can now apply the theorem using the tridiagonal matrix T^ arising from the restarted pro- 
cess. This is, however, far too costly in practice: computing the error estimate would require k 
multiplications with A — approximately the same amount of work that we would need to advance 
the primary iteration from step m to m + k. 

Fortunately, it is possible to cheaply retrieve the matrix T^ of the secondary Lanczos process 
from the matrix T,y,^\+k of the primary Lanczos process. This Lanczos restart recovery opens the 
way to efficiently obtain all the error estimates from Theorem [I] in a retrospective manner: At 
iteration m + ^ we get the estimates for the error at iteration m without using any matrix- vector 
multiplications with A and with cost &{k^), independently of the system size n. 

For m = 0, 1, . . ., we define the tridiagonal matrix as the diagonal block of 7]„+i+j; 

ranging from rows and columns max{l,m+ 1 — ^} to m + \+k. So rl'^+i''^) is a {2k+ 1) x {2k+ 1) 
matrix, except for m + 1 <k, where its size is (m + l+ k) x {m+\+k). 

The following theorem, see |^, shows that for Lanczos restart recovery we basically have to 
run the Lanczos process for the tridiagonal matrix r('"+i'*^), starting with the ^+ 1st unit vector 
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Theorem 2. Let the Lanczos relation for k steps of the Lanczos process for 7" with starting 



vector ek+\ G ' 



(e,n+i G ifm+\<k)be given 



as 



T{m+l,k)y^ = Vt+l,kTk+l,k. 



Then the matrix T^,i of the restarted Lanczos relation {3J_) is given as 



^k+\,k 



Tk+i,k- 



(3.2) 



(3.3) 



The above theorem shows that we can retrieve T^_^_y ^ from T,„+ic-i-i^,„+k by performing k steps 
of the Lanczos process for the {2k + 1) x {2k + 1) tridiagonal matrix Herein, each step 

has work ff{k), so that the overall cost for computing T^^^ ^ is ff{k^). So we conclude that the total 
cost for computing the error estimates from Theorem |l| is also ff{k^). 



Algorithm 3.1: Lanczos approximation for Zolotarev function with error bounds 



set;c_i =0, po = ||^||2, To = 1 
choose k 

for m = 0, 1, . . . do 

compute ttm+i, j8„,+i, v,„^2 using the Lanczos process for A 



for i = \,... ,p do 
if m > then 



/* loop over poles */ 



1 



, o»— 1 



end 

Pm+\ — Pm a,„+y-ai 

(0 

■"-«!+ 1 



rVm+1 1 + ( 1 - T] 



end 

i(m> k then 

perform k steps of the Lanczos process for 7"('«^'=.'=) 
this yields the tridiagonal matrix tk G C*^^*^ 

4i-yt = \\gm{fk)ei\\2 

Um-k = \\gm{f^^)e i\\2 /* fk,!'^^ given in Theorem 0{ii) */ 



end 



end 



Algorithm |3JJ shows how we suggest to use the results exposed so far. It computes the Lanczos 
approximations Xm for g{A)b with g{t) = ^f^j and bounds £m-k, Um-k for the error at iteration 
m based on the Gauss and the Gauss-Radau rule. The Algorithm can be modified to also obtain 
error estimates or bounds based on the Gauss-Lobatto rule and to get bounds for the A-norm in case 
we deal with a linear system. 
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sign function in QCD, k=2, new method 



sign function in OCD, k=10, new method 
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Sign function in QCD, k=2, method from Frommer-Sin" 




Sign function in QCD, k=10, method from Frommer-Slmoncini 
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Figure 1: Error bounds and exact error for Zolotarev approximation for sign(2), 8"* lattice. Left column: 



k = 2, right column: k = 10. Top row: Algorithm 3.1 , bottom row: method from 



4. Numerical results 



In this section we repoit the results of several numerical experiments with relatively small lat- 
tices of size 8^ to 16^. In our computations we used the common deflation technique as described, 
e.g. in 1^: We precompute the first, Ai, . . . , say, eigenpairs of smallest modulus. With IT de- 
noting the orthogonal projection onto the space spanned by the corresponding eigenvectors, we 
then have sign(2)Zj = sign(2(/ — 11)^) + ?,ign{QYlb) . Herein, we know ?<ign{QYlb) explicitly, so 
that we now just have to approximate sign(2(7 — Yi)b). In this manner, we effectively shrink the 
eigenvalue intervals for Q, so that we need fewer poles for an accurate Zolotarev approximation 
and, in addition, the linear systems to be solved converge more rapidly. Within an iterative solver 
for the overlap operator this approach results in a major speedup, since sign(2)Z7 must usually be 



computed repeatedly for various vectors b. For Algorithm |3JJ it has the additional advantage that 
we immediately have a very good value for a, the lower bound on the smallest eigenvalue of 
for which we can take A^. In all our computations we deflated the smallest 30 eigenvalues, and we 
chose the Zolotarev approximation to have error less than 10"^. 

Figure [T] shows results for the 8^ configuration available in the matrix group QCD at the UFL 
sparse matrix collection as matrix conf 5 . 4-0018x8-2000 .mtx. This is a dynamicyally 
generated configuration at j8 = 5.4. The (effective) condition number of the (deflated) matrix 
is approximately 4.5 • 10^. The left column of the figure reports upper and lower bounds from 
Algorithm 3. 1 whereas the right columns gives the estimates from earlier work [fl] which are known 



know to be lower bounds. The top row takes k = 2m Algorithm 3.1 (and a similar parameter in 
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sign function in QCD, i^=1 0, new method 
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Figure 2: Error bounds and exact error for Zolotarev approximation for sign(2), 16^ lattice, Algorithm 3J_ 

the method from [Q]), and the bottom row refers to ^ = 10. We see that going from ^ = 2 to 10 
results in a significant gain in accuracy and that for k= \0 the upper and lower bounds just differ 
by a factor of 10. 

Figure |^ gives the results for Algorithm ^J] with ^ = 10 for a configuration on a 16^^ lattice. 
The configuration was the result of a quenched simulation. The condition number of the deflated 
matrix is now 64^, i.e. less than for the 8^ lattice. Therefore, the convergence speed as well as 
the quality of the bounds are better than for the 8^ lattice. 
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